NestGNN: A Graph Neural Network Framework Generalizing the Nested Logit Model for Travel Mode Choice

Zhou, Yuqi, Cheng, Zhanhong, Hu, Lingqian, Bu, Yuheng, Wang, Shenhao

arXiv.org Machine Learning

The nested logit (NL) model has been commonly used for discrete choice analysis, with applications ranging from travel mode choice to automobile ownership and location decisions. However, classical NL models are restricted by their limited representation capability and handcrafted utility specifications. While researchers have introduced deep neural networks (DNNs) to tackle these challenges, existing DNNs cannot explicitly capture inter-alternative correlations in the discrete choice context. To address these challenges, this study proposes a novel concept, the alternative graph, to represent the relationships among travel mode alternatives. Using a nested alternative graph, this study further designs a nested-utility graph neural network (NestGNN) as a generalization of the classical NL model in the neural network family. Theoretically, NestGNNs generalize the classical NL models and existing DNNs in terms of model representation, while retaining the crucial two-layer substitution pattern of the NL model: proportional substitution within a nest but non-proportional substitution across nests. Empirically, we find that NestGNNs significantly outperform the benchmark models, in particular the corresponding NL models, by 9.2%. As shown by elasticity tables and substitution visualizations, NestGNNs retain the same two-layer substitution patterns as the NL model, yet offer more flexibility in their model design space. Overall, our study demonstrates the power of NestGNN in prediction and interpretation, and its flexibility in generalizing the classical NL model for analyzing travel mode choice.
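
To make the two-layer substitution structure concrete, here is a minimal Python sketch of the classical nested logit probabilities that NestGNN generalizes. The utilities, nests, and scale parameters below are illustrative assumptions, not the paper's specification; roughly speaking, the paper's contribution is to replace the handcrafted utilities in such a model with graph neural network layers over the nested alternative graph.

```python
import numpy as np

# A minimal sketch of the classical nested logit (NL) probabilities that
# NestGNN generalizes. Utilities, nests, and scale parameters below are
# illustrative assumptions, not the paper's specification.

def nested_logit_probs(utilities, nests, lambdas):
    """Two-layer NL choice probabilities: P(i) = P(nest m) * P(i | m)."""
    # Inclusive value (logsum) of each nest: IV_m = log sum_i exp(V_i / lambda_m)
    iv = {m: np.log(sum(np.exp(utilities[i] / lambdas[m]) for i in alts))
          for m, alts in nests.items()}
    # Upper level: choice among nests, P(m) proportional to exp(lambda_m * IV_m)
    denom = sum(np.exp(lambdas[m] * iv[m]) for m in nests)
    probs = {}
    for m, alts in nests.items():
        p_nest = np.exp(lambdas[m] * iv[m]) / denom
        for i in alts:
            # Lower level: MNL within the nest -> proportional substitution inside it
            probs[i] = p_nest * np.exp(utilities[i] / lambdas[m]) / np.exp(iv[m])
    return probs

# Hypothetical travel modes grouped into two nests
V = {"car": 0.5, "carpool": 0.1, "bus": -0.2, "rail": 0.0}
nests = {"auto": ["car", "carpool"], "transit": ["bus", "rail"]}
print(nested_logit_probs(V, nests, lambdas={"auto": 0.6, "transit": 0.8}))
```

Lowering a nest's lambda toward 0 makes alternatives inside it closer substitutes, which is exactly the within-nest versus across-nest asymmetry the abstract refers to.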


Decision Transformer for Enhancing Neural Local Search on the Job Shop Scheduling Problem

de Puiseau, Constantin Waubert, Wolz, Fabian, Montag, Merlin, Peters, Jannik, Tercan, Hasan, Meisen, Tobias

arXiv.org Artificial Intelligence

The job shop scheduling problem (JSSP) and its solution algorithms have been of enduring interest in both academia and industry for decades. In recent years, machine learning (ML) has played an increasingly important role in advancing existing heuristic solutions for the JSSP and building new ones, aiming to find better solutions in shorter computation times. In this paper, we build on a state-of-the-art deep reinforcement learning (DRL) agent called Neural Local Search (NLS), which can efficiently and effectively control a large local neighborhood search on the JSSP. In particular, we develop a method for training the decision transformer (DT) algorithm on search trajectories taken by a trained NLS agent to further improve upon the learned decision-making sequences. Our experiments show that the DT successfully learns local search strategies that are different from, and in many cases more effective than, those of the NLS agent itself. With respect to the tradeoff between solution quality and computation time, the DT is particularly superior in application scenarios where longer search times are acceptable. There, the longer inference time per search step caused by its larger neural network architecture is offset by better-quality decisions per step. As a result, the DT achieves state-of-the-art results for solving the JSSP with ML-enhanced search.
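
The paper's implementation is not reproduced here, but the return-conditioned sequence modeling behind decision transformers can be sketched as follows: recorded NLS search trajectories become interleaved (return-to-go, state, action) token sequences, and a causal transformer is trained to predict the recorded move at each step. All dimensions, the trajectory encoding, and the tiny architecture below are illustrative assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

# A minimal, hypothetical sketch of the decision-transformer idea applied to
# recorded local-search trajectories: each step becomes a (return-to-go, state,
# action) token triple, and a causal transformer predicts the recorded NLS move.

class TinyDecisionTransformer(nn.Module):
    def __init__(self, state_dim, n_actions, d_model=64, n_layers=2):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Embedding(n_actions, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, rtg, states, actions):
        B, T, _ = states.shape
        # Interleave (rtg, state, action) embeddings along the time axis
        tokens = torch.stack([self.embed_rtg(rtg),
                              self.embed_state(states),
                              self.embed_action(actions)], dim=2).reshape(B, 3 * T, -1)
        causal = torch.triu(torch.full((3 * T, 3 * T), float("-inf")), diagonal=1)
        h = self.encoder(tokens, mask=causal)
        # Predict each action from the hidden state at its preceding state token
        return self.head(h[:, 1::3])

# Training signal: cross-entropy against the moves recorded from the NLS agent
model = TinyDecisionTransformer(state_dim=32, n_actions=8)
rtg = torch.randn(4, 10, 1)              # remaining improvement target, scaled
states = torch.randn(4, 10, 32)          # encoded schedule / search state
actions = torch.randint(0, 8, (4, 10))
logits = model(rtg, states, actions)
loss = nn.functional.cross_entropy(logits.reshape(-1, 8), actions.reshape(-1))
```

At inference time, conditioning on an ambitious return-to-go is what lets such a model aim for better search sequences than the ones it was trained on.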


Natural Learning

Fanaee-T, Hadi

arXiv.org Artificial Intelligence

We introduce Natural Learning (NL), a novel algorithm that elevates the explainability and interpretability of machine learning to an extreme level. NL simplifies decisions into intuitive rules, like "We rejected your loan because your income, employment status, and age collectively resemble a rejected prototype more than an accepted prototype." When applied to real-life datasets, NL produces impressive results. For example, on a colon cancer dataset with 1545 patients and 10935 genes, NL achieves 98.1% accuracy, comparable to DNNs and RF, by analyzing just 3 genes of test samples against 2 discovered prototypes. Similarly, on the UCI WDBC dataset, NL achieves 98.3% accuracy using only 7 features and 2 prototypes. Even on the MNIST dataset (0 vs. 1), NL achieves 99.5% accuracy with only 3 pixels from 2 prototype images. NL is inspired by prototype theory, a classic concept in cognitive psychology suggesting that people learn single sparse prototypes to categorize objects. Leveraging this relaxed assumption, we redesign the Support Vector Machine (SVM), replacing its mathematical formulation with a fully nearest-neighbor-based solution, and to address the curse of dimensionality, we use locality-sensitive hashing. Following the theory's generalizability principle, we propose a recursive method to prune non-core features. As a result, NL efficiently discovers the sparsest prototypes in O(n^2pL) time with high parallelization capacity in terms of n. Evaluation of NL on 17 benchmark datasets shows that it significantly outperforms decision trees and logistic regression, two methods widely favored in healthcare for their interpretability. Moreover, NL achieves performance comparable to fine-tuned black-box models such as deep neural networks and random forests in 40% of cases, with only 1-2% lower average accuracy. The code is available via http://natural-learning.cc.
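
As a rough illustration of the decision rule the abstract describes (not the published NL algorithm itself, which is available at the link above), the sketch below classifies a sample by whichever class prototype it most resembles on a small retained feature subset. Prototype discovery, the LSH-based search, and recursive feature pruning are omitted, and all values are illustrative.

```python
import numpy as np

# A rough illustration of the prototype decision rule described above, not the
# published NL algorithm: classify a sample by whichever class prototype it
# most resembles on a small set of retained features. Prototype discovery,
# LSH search, and feature pruning are omitted; all values are illustrative.

def prototype_predict(X, prototypes, labels, features):
    """Assign each row of X to the label of its nearest prototype."""
    Xs, Ps = X[:, features], prototypes[:, features]
    # Euclidean distance from every sample to every prototype
    d = np.linalg.norm(Xs[:, None, :] - Ps[None, :, :], axis=2)
    return labels[np.argmin(d, axis=1)]

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
prototypes = np.stack([X[0], X[1]])      # stand-ins for discovered prototypes
print(prototype_predict(X, prototypes, np.array([0, 1]), features=[2, 5, 7]))
```

The appeal is that the whole classifier is the pair (prototypes, features), so a prediction can be read back as "this sample resembles prototype A more than prototype B on these few features."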


Benchmarking the Neural Linear Model for Regression

Ober, Sebastian W., Rasmussen, Carl Edward

arXiv.org Machine Learning

The neural linear model is a simple adaptive Bayesian linear regression method that has recently been used in a number of problems ranging from Bayesian optimization to reinforcement learning. Despite its apparent successes in these settings, to the best of our knowledge there has been no systematic exploration of its capabilities on simple regression tasks. In this work, we characterize those capabilities on the UCI datasets, a popular benchmark for Bayesian regression models, as well as on the recently introduced UCI "gap" datasets, which are better tests of out-of-distribution uncertainty. We demonstrate that the neural linear model generally performs well on these tasks, but at the cost of requiring careful hyperparameter tuning.
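
For readers unfamiliar with the model, a minimal sketch of the neural linear idea follows: treat the network's last-layer features as a fixed basis and perform exact Bayesian linear regression on top. The random feature map below is a stand-in for a trained network's penultimate layer, and alpha and beta are assumed prior-precision and noise-precision hyperparameters, the kind whose tuning the paper finds critical.

```python
import numpy as np

# A minimal sketch of the neural linear idea: fix a feature map (here a random
# stand-in for a trained network's penultimate layer) and do exact Bayesian
# linear regression on top. alpha (prior precision) and beta (noise precision)
# are assumed hyperparameters.

def features(X, W, b):
    return np.tanh(X @ W + b)

def posterior(Phi, y, alpha=1.0, beta=25.0):
    """Standard Bayesian linear regression posterior N(mean, cov) over weights."""
    cov = np.linalg.inv(alpha * np.eye(Phi.shape[1]) + beta * Phi.T @ Phi)
    mean = beta * cov @ Phi.T @ y
    return mean, cov

def predict(Phi_star, mean, cov, beta=25.0):
    mu = Phi_star @ mean
    # Predictive variance: noise term plus epistemic term phi^T cov phi
    var = 1.0 / beta + np.sum((Phi_star @ cov) * Phi_star, axis=1)
    return mu, var

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=50)
W, b = rng.normal(size=(1, 20)), rng.normal(size=20)
mean, cov = posterior(features(X, W, b), y)
mu, var = predict(features(np.linspace(-4, 4, 9)[:, None], W, b), mean, cov)
```

Because the regression head has a closed-form posterior, the only expensive choices are the feature network itself and the alpha/beta hyperparameters, which is where the tuning sensitivity the abstract reports comes from.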